Goto

Collaborating Authors

 contact relationship


Implicit Contact Diffuser: Sequential Contact Reasoning with Latent Point Cloud Diffusion

arXiv.org Artificial Intelligence

Long-horizon contact-rich manipulation has long been a challenging problem, as it requires reasoning over both discrete contact modes and continuous object motion. We introduce Implicit Contact Diffuser (ICD), a diffusion-based model that generates a sequence of neural descriptors that specify a series of contact relationships between the object and the environment. This sequence is then used as guidance for an MPC method to accomplish a given task. The key advantage of this approach is that the latent descriptors provide more task-relevant guidance to MPC, helping to avoid local minima for contact-rich manipulation tasks. Our experiments demonstrate that ICD outperforms baselines on complex, long-horizon, contact-rich manipulation tasks, such as cable routing and notebook folding. Additionally, our experiments also indicate that \methodshort can generalize a target contact relationship to a different environment. More visualizations can be found on our website $\href{https://implicit-contact-diffuser.github.io/}{https://implicit-contact-diffuser.github.io}$


Learning Multi-Step Manipulation Tasks from A Single Human Demonstration

arXiv.org Artificial Intelligence

Learning from human demonstrations has exhibited remarkable achievements in robot manipulation. However, the challenge remains to develop a robot system that matches human capabilities and data efficiency in learning and generalizability, particularly in complex, unstructured real-world scenarios. We propose a system that processes RGBD videos to translate human actions to robot primitives and identifies task-relevant key poses of objects using Grounded Segment Anything. We then address challenges for robots in replicating human actions, considering the human-robot differences in kinematics and collision geometry. To test the effectiveness of our system, we conducted experiments focusing on manual dishwashing. With a single human demonstration recorded in a mockup kitchen, the system achieved 50-100% success for each step and up to a 40% success rate for the whole task with different objects in a home kitchen. Videos are available at https://robot-dishwashing.github.io


New Artificial Intelligence System Enables Machines That See the World More Like Humans Do

#artificialintelligence

A new "common-sense" approach to computer vision enables artificial intelligence that interprets scenes more accurately than other systems do. Computer vision systems sometimes make inferences about a scene that fly in the face of common sense. For example, if a robot were processing a scene of a dinner table, it might completely ignore a bowl that is visible to any human observer, estimate that a plate is floating above the table, or misperceive a fork to be penetrating a bowl rather than leaning against it. Move that computer vision system to a self-driving car and the stakes become much higher -- for example, such systems have failed to detect emergency vehicles and pedestrians crossing the street. To overcome these errors, MIT researchers have developed a framework that helps machines see the world more like humans do. Their new artificial intelligence system for analyzing scenes learns to perceive real-world objects from just a few images, and perceives scenes in terms of these learned objects.


Machines that see the world more like humans do

#artificialintelligence

Computer vision systems sometimes make inferences about a scene that fly in the face of common sense. For example, if a robot were processing a scene of a dinner table, it might completely ignore a bowl that is visible to any human observer, estimate that a plate is floating above the table, or misperceive a fork to be penetrating a bowl rather than leaning against it. Move that computer vision system to a self-driving car and the stakes become much higher --for example, such systems have failed to detect emergency vehicles and pedestrians crossing the street. To overcome these errors, MIT researchers have developed a framework that helps machines see the world more like humans do. Their new artificial intelligence system for analyzing scenes learns to perceive real-world objects from just a few images, and perceives scenes in terms of these learned objects. The researchers built the framework using probabilistic programming, an AI approach that enables the system to cross-check detected objects against input data, to see if the images recorded from a camera are a likely match to any candidate scene.


Machines that see the world more like humans do

#artificialintelligence

Computer vision systems sometimes make inferences about a scene that fly in the face of common sense. For example, if a robot were processing a scene of a dinner table, it might completely ignore a bowl that is visible to any human observer, estimate that a plate is floating above the table, or misperceive a fork to be penetrating a bowl rather than leaning against it. Move that computer vision system to a self-driving car and the stakes become much higher -- for example, such systems have failed to detect emergency vehicles and pedestrians crossing the street. To overcome these errors, MIT researchers have developed a framework that helps machines see the world more like humans do. Their new artificial intelligence system for analyzing scenes learns to perceive real-world objects from just a few images, and perceives scenes in terms of these learned objects. The researchers built the framework using probabilistic programming, an AI approach that enables the system to cross-check detected objects against input data, to see if the images recorded from a camera are a likely match to any candidate scene.